SFB 823 A subsampled double bootstrap for massive data

Authors

  • Srijan Sengupta
  • Stanislav Volgushev
  • Xiaofeng Shao
Abstract

The bootstrap is a popular and powerful method for assessing the precision of estimators and inferential methods. However, for massive datasets, which are increasingly prevalent, the bootstrap becomes prohibitively costly to compute, and its feasibility is questionable even with modern parallel computing platforms. Recently, Kleiner, Talwalkar, Sarkar, and Jordan (2014) proposed a method called BLB (Bag of Little Bootstraps) for massive data that is more computationally scalable with little sacrifice of statistical accuracy. Building on BLB and the idea of the fast double bootstrap, we propose a new resampling method, the subsampled double bootstrap, for both independent data and time series data. We establish consistency of the subsampled double bootstrap under mild conditions for both the independent and dependent cases. Methodologically, the subsampled double bootstrap is superior to BLB in terms of running time, sample coverage, and automatic implementation with fewer tuning parameters for a given time budget. Its advantage relative to BLB and the bootstrap is also demonstrated in numerical simulations and a data illustration.
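
The abstract does not spell out the algorithm, so the following is only a rough sketch of one way a subsampled double bootstrap for independent data might look: repeatedly draw a small random subset, then draw a single full-size resample from that subset, and collect the differences between the two estimates. The function name `subsampled_double_bootstrap`, its parameters, and the choice of an unscaled root are assumptions made for illustration and are not taken from the paper; the time series variant is not covered.

```python
import random
import statistics


def subsampled_double_bootstrap(data, estimator, subset_size, num_subsets, seed=0):
    """Illustrative sketch (assumed interface, not the authors' code).

    Returns a list of roots estimator(resample) - estimator(subset),
    one per random subset.
    """
    rng = random.Random(seed)
    n = len(data)
    roots = []
    for _ in range(num_subsets):
        # First stage: a random subset of size b, drawn without replacement.
        subset = rng.sample(data, subset_size)
        theta_subset = estimator(subset)
        # Second stage: ONE resample of full size n, drawn with replacement
        # from the subset. It is materialized here for clarity; a practical
        # version would likely store only counts per subset point.
        resample = rng.choices(subset, k=n)
        roots.append(estimator(resample) - theta_subset)
    return roots


def standard_error_from_roots(roots):
    """Spread of the roots, used here as a standard-error estimate."""
    return statistics.stdev(roots)
```

Under these assumptions, calling `subsampled_double_bootstrap(data, statistics.mean, subset_size, num_subsets)` on a list of numbers and passing the result to `standard_error_from_roots` gives an approximate standard error for the sample mean. Each repetition only touches `subset_size` distinct data points, which is where the computational saving over the full bootstrap would come from.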

Related resources

A scalable bootstrap for massive data

The bootstrap provides a simple and powerful means of assessing the quality of estimators. However, in settings involving large datasets, which are increasingly prevalent, the computation of bootstrap-based quantities can be prohibitively demanding. While variants such as subsampling and the m out of n bootstrap can be used in principle to reduce the cost of bootstrap computation...

SFB 823 A fluctuation test for constant Spearman's rho

We propose a CUSUM type test for constant correlation that goes beyond a previously suggested correlation constancy test by considering Spearman’s rho in arbitrary dimensions. By using copula-based expressions, we simultaneously extend a previously suggested copula constancy test. We calculate the asymptotic null distribution using an invariance principle for the sequential empirical copula pro...

SFB 823 Skew-symmetric distributions and Fisher information: The double sin of the skew-normal

Hallin and Ley (2012) investigate and fully characterize the Fisher singularity phenomenon in univariate and multivariate families of skew-symmetric distributions. This paper proposes a refined analysis of the (univariate) Fisher degeneracy problem, showing that it can be more or less severe, inducing n^{1/4} ("simple singularity"), n^{1/6} ("double singularity"), or n^{1/8} ("triple singularity") consistency ra...

SFB 823 Reject inference in consumer credit scoring with nonignorable missing data

We generalize an empirical likelihood approach to missing data to the case of consumer credit scoring and provide a Hausman test for nonignorability of the missings. An application to recent consumer credit data shows that our model yields parameter estimates which are significantly different (both statistically and economically) from the case where customers who were refused credit are ignored.

Publication date: 2015